Richard Hudson and Norman Kaplan on the Coalescent Process.
Abstract
Many important questions in genetics involve looking back in time using data sampled in the present. The coalescent process describes the ancestry of a sample of genes: as lineages trace back, they coalesce at a rate inversely proportional to the effective population size. This remarkably simple approximation now dominates population genetics, both because it gives a direct intuition into the evolutionary process and because it allows efficient simulation. The coalescent predicts the genealogical relationships between sampled genes, and it depends only on the effective population size, regardless of the detailed life history.

The coalescent was first described by Kingman (1982), but was developed independently by Hudson (1983) and Tajima (1983); its influence on population genetics came primarily through Hudson. The coalescent is rooted in older concepts: Malécot’s (1948) idea of identity by descent, which is central to quantitative genetics (Kempthorne 1954), and the diffusion approximation (Kimura 1955), which depends on the same effective population size. The coalescent emerged not through radically new concepts, but, rather, because it gives a natural way to analyze the samples of DNA sequences that were just becoming available. Hudson and Kaplan (1988), with its companion (Kaplan et al. 1988), extended the coalescent to include selection and used it to interpret data on sequence variation around the alcohol dehydrogenase (Adh) locus of Drosophila melanogaster (Kreitman 1983).

Under the coalescent, lineages coalesce at a rate equal to the inverse of the effective number of genes in the population, 1/(2Ne). Migration, recombination, and mutation can be included by allowing ancestral lineages to jump between locations, genetic backgrounds, or allelic states; this extension is known as the structured coalescent (Hudson 1983). Selection is much harder to incorporate because the ancestry now depends on the allelic state. However, Kaplan et al. (1988) showed that, if the selected backgrounds are taken as given, then the structured coalescent describes the genealogy of linked neutral alleles; random fluctuations in the frequency of the selected backgrounds can be treated by a diffusion that couples to the coalescent. A surprising conclusion from this method is that selection has to be extremely strong relative to random drift to distort neutral genealogies (Barton and Etheridge 2004). This is a fundamental obstacle to detecting selection from sequence data.

The fast and slow (F/S) alleles of the Adh locus of D. melanogaster differ by a single amino acid. Kreitman (1983) sequenced 11 copies of the locus and found a sharp peak of polymorphism around the amino-acid difference. This is consistent with maintenance of these alleles by long-term balancing selection. Hudson and Kaplan (1988) showed that divergence between the F and S alleles is consistent with balancing selection, albeit with a lower-than-average recombination rate. However, there was also excess variation among the S alleles, which was not expected. Even though Adh in Drosophila is one of the most intensively studied polymorphisms, we still do not know how its sequence variation has been shaped by selection (Begun et al. 1999).

Population genetics is now focused on understanding the abundance of sequence data that has recently become available. Since the first work of Kreitman (1983), the goal has been to infer the nature and strength of selection across the genome directly from the DNA sequence; “genome scans” of the kind introduced by Hudson and Kaplan (1988) are now ...

Copyright © 2016 by the Genetics Society of America
doi: 10.1534/genetics.116.187542
Photo of Norman Kaplan (left) courtesy of Jotun Hein. Photo of Richard Hudson (right) courtesy of himself.
Address for correspondence: IST Austria, Am Campus 1, A-3400 Klosterneuburg, Austria. E-mail: [email protected]
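As a rough illustration of the coalescence rate described in the abstract, the following minimal Python sketch draws the waiting times between successive coalescence events for a sample of genes. It assumes a single neutral, non-recombining locus, a constant effective size, and a sample much smaller than the population, so that each pair of lineages coalesces at rate 1/(2Ne) per generation. The function names `simulate_coalescent_times` and `expected_tmrca` are hypothetical and chosen only for this example; they do not come from Hudson and Kaplan (1988) or from any published software.

```python
import random


def simulate_coalescent_times(n_samples, n_e, rng=None):
    """Draw waiting times (in generations) between coalescence events.

    Assumption: neutral coalescent with constant effective size n_e,
    so each pair of lineages coalesces at rate 1 / (2 * n_e).
    """
    rng = rng or random.Random()
    times = []
    k = n_samples  # number of ancestral lineages still distinct
    while k > 1:
        n_pairs = k * (k - 1) / 2
        total_rate = n_pairs / (2 * n_e)
        # Time until any one of the pairs coalesces is exponential.
        times.append(rng.expovariate(total_rate))
        k -= 1
    return times


def expected_tmrca(n_samples, n_e):
    """Expected time to the most recent common ancestor, 4*Ne*(1 - 1/n)."""
    return 4 * n_e * (1 - 1 / n_samples)


if __name__ == "__main__":
    rng = random.Random(1)
    replicates = [sum(simulate_coalescent_times(10, 1000, rng))
                  for _ in range(5000)]
    print("simulated mean TMRCA:", sum(replicates) / len(replicates))
    print("expected TMRCA:", expected_tmrca(10, 1000))
```

Summing the waiting times gives the time to the most recent common ancestor of the sample; averaged over many replicates, it should approach 4Ne(1 - 1/n), which follows from adding the expected waiting times 2Ne divided by k(k - 1)/2 for k = 2, ..., n.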
Similar resources
The coalescent process in models with selection.
Statistical properties of the process describing the genealogical history of a random sample of genes are obtained for a class of population genetics models with selection. For models with selection, in contrast to models without selection, the distribution of this process, the coalescent process, depends on the distribution of the frequencies of alleles in the ancestral generations. If the anc...
The structure of genealogies in the presence of purifying selection: a fitness-class coalescent.
Compared to a neutral model, purifying selection distorts the structure of genealogies and hence alters the patterns of sampled genetic variation. Although these distortions may be common in nature, our understanding of how we expect purifying selection to affect patterns of molecular variation remains incomplete. Genealogical approaches such as coalescent theory have proven difficult to genera...
The Ancestry of a Gene
Introduction Gene fixation in the sense that there is a single ancestor from which all the base pairs in all the copies of a gene in the population are descended only occurs in small (N < 1000) populations. In large populations (N > 1 000 000) crossing over (recombination) within the gene provides that there is an ancestral pool rather than a single ancestor of the gene. In the absence of recom...
SIMCOAL: a general coalescent program for the simulation of molecular data in interconnected populations with arbitrary demography.
SIMCOAL (version 1.0) is a computer program for the simulation of molecular genetic diversity in an arbitrary number of haploid populations examined for a set of fully linked loci. It is based on the retrospective coalescent approach initially described by Kingman (1982a,b), and clearly exposed in a series of other articles (Donnelly and Tavaré 1995; Ewens 1990; Hudson 1990). The coalescent bac...
The coalescent process in models with selection and recombination.
The statistical properties of the process describing the genealogical history of a random sample of genes at a selectively neutral locus which is linked to a locus at which natural selection operates are investigated. It is found that the equations describing this process are simple modifications of the equations describing the process assuming that the two loci are completely linked. Thus, the...
Journal: Genetics
Volume 202, Issue 3
Publication date: 2016